Overview

Dataset statistics

Number of variables: 10
Number of observations: 3276
Missing cells: 1434
Missing cells (%): 4.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 256.1 KiB
Average record size in memory: 80.0 B
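
The missing-cell figures can be cross-checked against the per-variable counts listed under Warnings. The following is a minimal sketch that only reuses numbers from this report; the variable names are illustrative and are not part of pandas-profiling.

```python
# Cross-check "Missing cells" and "Missing cells (%)" from the report's own numbers.
n_variables = 10
n_observations = 3276

# Per-variable missing counts as listed in the Warnings section.
missing_per_variable = {"ph": 491, "Sulfate": 781, "Trihalomethanes": 162}

missing_cells = sum(missing_per_variable.values())   # 1434
total_cells = n_variables * n_observations           # 32760
missing_pct = 100 * missing_cells / total_cells

print(f"{missing_cells} of {total_cells} cells missing ({missing_pct:.1f}%)")
```

The percentage is taken over all cells (variables times observations), not over rows, which is why it is much smaller than the 23.8% reported for Sulfate alone.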

Variable types

NUM: 9
BOOL: 1

Warnings

ph has 491 (15.0%) missing values
Sulfate has 781 (23.8%) missing values
Trihalomethanes has 162 (4.9%) missing values
Hardness has unique values
Solids has unique values
Chloramines has unique values
Conductivity has unique values
Organic_carbon has unique values
Turbidity has unique values

Reproduction

Analysis started: 2022-04-12 19:18:29.424274
Analysis finished: 2022-04-12 19:18:43.829119
Duration: 14.4 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

ph
Real number (ℝ≥0)

MISSING

Distinct: 2785
Distinct (%): 100.0%
Missing: 491
Missing (%): 15.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.080794504
Minimum: 0
Maximum: 14
Zeros: 1
Zeros (%): < 0.1%
Memory size: 25.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 4.487970742
Q1: 6.093091914
median: 7.036752104
Q3: 8.062066123
95-th percentile: 9.789818577
Maximum: 14
Range: 14
Interquartile range (IQR): 1.968974209

Descriptive statistics

Standard deviation: 1.594319519
Coefficient of variation (CV): 0.2251611055
Kurtosis: 0.7203155798
Mean: 7.080794504
Median Absolute Deviation (MAD): 0.984116999
Skewness: 0.02563044763
Sum: 19720.01269
Variance: 2.541854728
Monotonicity: Not monotonic
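
Several of the ph statistics above are tied together by simple identities, which makes them easy to sanity-check. The sketch below copies the ph values from this report and recomputes the derived quantities; the variable names are illustrative, and the comparison is approximate because the report rounds to about ten significant digits.

```python
# Recompute derived ph statistics from the reported base values.
mean = 7.080794504
std = 1.594319519
q1, q3 = 6.093091914, 8.062066123
minimum, maximum = 0.0, 14.0
total = 19720.01269          # reported "Sum"
n_observations = 3276
n_missing = 491

cv = std / mean                      # coefficient of variation
variance = std ** 2                  # variance is the squared standard deviation
iqr = q3 - q1                        # interquartile range
value_range = maximum - minimum
n_non_missing = round(total / mean)  # the sum and mean only cover non-missing values

print(f"CV        {cv:.10f}")
print(f"Variance  {variance:.9f}")
print(f"IQR       {iqr:.9f}")
print(f"Range     {value_range}")
print(f"Non-missing observations {n_non_missing} (expected {n_observations - n_missing})")
```

The last line shows that the mean and sum are computed over the 2785 non-missing observations only, which is also the reported Distinct count.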
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
8.55409697            1       < 0.1%
6.538084087           1       < 0.1%
5.91580675            1       < 0.1%
8.136497869           1       < 0.1%
6.493764175           1       < 0.1%
6.977405633           1       < 0.1%
5.489248055           1       < 0.1%
2.558102799           1       < 0.1%
7.312109304           1       < 0.1%
6.704431913           1       < 0.1%
6.44897109            1       < 0.1%
8.028304242           1       < 0.1%
8.616824426           1       < 0.1%
9.46712903            1       < 0.1%
6.793698635           1       < 0.1%
5.486058602           1       < 0.1%
5.05074838            1       < 0.1%
6.246117565           1       < 0.1%
8.142330913           1       < 0.1%
5.189413669           1       < 0.1%
7.879543234           1       < 0.1%
8.801933555           1       < 0.1%
7.038092306           1       < 0.1%
5.423318496           1       < 0.1%
4.838571107           1       < 0.1%
Other values (2760)   2760    84.2%
(Missing)             491     15.0%

Value                 Count   Frequency (%)
0                     1       < 0.1%
0.2274990502          1       < 0.1%
0.9755779898          1       < 0.1%
0.9899122129          1       < 0.1%
1.431781555           1       < 0.1%
1.757037115           1       < 0.1%
1.844538366           1       < 0.1%
1.985383359           1       < 0.1%
2.128531434           1       < 0.1%
2.376768076           1       < 0.1%

Value                 Count   Frequency (%)
14                    1       < 0.1%
13.54124024           1       < 0.1%
13.34988856           1       < 0.1%
13.17540172           1       < 0.1%
12.24692807           1       < 0.1%
11.90773983           1       < 0.1%
11.89807803           1       < 0.1%
11.62114013           1       < 0.1%
11.56876797           1       < 0.1%
11.56316906           1       < 0.1%

Hardness
Real number (ℝ≥0)

UNIQUE

Distinct: 3276
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 196.369496
Minimum: 47.432
Maximum: 323.124
Zeros: 0
Zeros (%): 0.0%
Memory size: 25.7 KiB

Quantile statistics

Minimum: 47.432
5-th percentile: 141.7632807
Q1: 176.8505379
median: 196.9676269
Q3: 216.6674562
95-th percentile: 249.6097689
Maximum: 323.124
Range: 275.692
Interquartile range (IQR): 39.81691834

Descriptive statistics

Standard deviation: 32.87976148
Coefficient of variation (CV): 0.1674382332
Kurtosis: 0.6157716821
Mean: 196.369496
Median Absolute Deviation (MAD): 19.84498917
Skewness: -0.03934170478
Sum: 643306.469
Variance: 1081.078715
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
204.8904555           1       < 0.1%
134.5602761           1       < 0.1%
170.1909123           1       < 0.1%
237.4610992           1       < 0.1%
171.2389255           1       < 0.1%
197.4281988           1       < 0.1%
195.7440741           1       < 0.1%
184.2318535           1       < 0.1%
187.8732835           1       < 0.1%
205.1505644           1       < 0.1%
205.3385456           1       < 0.1%
205.5634995           1       < 0.1%
147.490575            1       < 0.1%
260.8450393           1       < 0.1%
199.8129989           1       < 0.1%
191.7084136           1       < 0.1%
148.8421293           1       < 0.1%
194.7191859           1       < 0.1%
204.7837347           1       < 0.1%
168.0424651           1       < 0.1%
228.7629452           1       < 0.1%
169.2144075           1       < 0.1%
227.2257507           1       < 0.1%
229.7253484           1       < 0.1%
169.333843            1       < 0.1%
Other values (3251)   3251    99.2%

Value                 Count   Frequency (%)
47.432                1       < 0.1%
73.49223369           1       < 0.1%
77.4595861            1       < 0.1%
81.71089527           1       < 0.1%
94.09130748           1       < 0.1%
94.81254522           1       < 0.1%
94.90897713           1       < 0.1%
97.2809086            1       < 0.1%
98.3679149            1       < 0.1%
98.45293051           1       < 0.1%

Value                 Count   Frequency (%)
323.124               1       < 0.1%
317.3381241           1       < 0.1%
311.3839565           1       < 0.1%
308.2538329           1       < 0.1%
307.7060241           1       < 0.1%
306.6274814           1       < 0.1%
304.2359121           1       < 0.1%
303.7026267           1       < 0.1%
300.2924758           1       < 0.1%
298.0986795           1       < 0.1%

Solids
Real number (ℝ≥0)

UNIQUE

Distinct: 3276
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 22014.09253
Minimum: 320.9426113
Maximum: 61227.19601
Zeros: 0
Zeros (%): 0.0%
Memory size: 25.7 KiB

Quantile statistics

Minimum: 320.9426113
5-th percentile: 9545.812579
Q1: 15666.6903
median: 20927.83361
Q3: 27332.76213
95-th percentile: 38474.99025
Maximum: 61227.19601
Range: 60906.2534
Interquartile range (IQR): 11666.07183

Descriptive statistics

Standard deviation: 8768.570828
Coefficient of variation (CV): 0.3983162521
Kurtosis: 0.4428260858
Mean: 22014.09253
Median Absolute Deviation (MAD): 5809.471858
Skewness: 0.6216344855
Sum: 72118167.12
Variance: 76887834.36
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
20791.31898           1       < 0.1%
15979.33479           1       < 0.1%
37000.95567           1       < 0.1%
18736.1909            1       < 0.1%
12289.90092           1       < 0.1%
15979.06027           1       < 0.1%
12431.80311           1       < 0.1%
30031.83918           1       < 0.1%
29532.615             1       < 0.1%
19821.33837           1       < 0.1%
25142.73374           1       < 0.1%
16100.96795           1       < 0.1%
21316.50673           1       < 0.1%
11803.7355            1       < 0.1%
14540.73508           1       < 0.1%
32112.56987           1       < 0.1%
13329.03225           1       < 0.1%
18344.06944           1       < 0.1%
20408.4856            1       < 0.1%
18564.37206           1       < 0.1%
19126.29854           1       < 0.1%
33365.31542           1       < 0.1%
14470.05355           1       < 0.1%
22444.55941           1       < 0.1%
19168.52677           1       < 0.1%
Other values (3251)   3251    99.2%

Value                 Count   Frequency (%)
320.9426113           1       < 0.1%
728.7508296           1       < 0.1%
1198.943699           1       < 0.1%
1351.906979           1       < 0.1%
1372.091043           1       < 0.1%
2552.962804           1       < 0.1%
2808.025756           1       < 0.1%
2835.303165           1       < 0.1%
2912.211247           1       < 0.1%
3413.081633           1       < 0.1%

Value                 Count   Frequency (%)
61227.19601           1       < 0.1%
56867.85924           1       < 0.1%
56488.67241           1       < 0.1%
56351.3963            1       < 0.1%
56320.58698           1       < 0.1%
55334.7028            1       < 0.1%
53735.89919           1       < 0.1%
52318.9173            1       < 0.1%
52060.2268            1       < 0.1%
51731.82055           1       < 0.1%

Chloramines
Real number (ℝ≥0)

UNIQUE

Distinct: 3276
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.122276793
Minimum: 0.352
Maximum: 13.127
Zeros: 0
Zeros (%): 0.0%
Memory size: 25.7 KiB

Quantile statistics

Minimum: 0.352
5-th percentile: 4.50305371
Q1: 6.127420755
median: 7.130298974
Q3: 8.114887032
95-th percentile: 9.753100546
Maximum: 13.127
Range: 12.775
Interquartile range (IQR): 1.987466277

Descriptive statistics

Standard deviation: 1.583084889
Coefficient of variation (CV): 0.2222723063
Kurtosis: 0.5899011686
Mean: 7.122276793
Median Absolute Deviation (MAD): 0.9916613427
Skewness: -0.01209844012
Sum: 23332.57878
Variance: 2.506157766
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
7.300211873           1       < 0.1%
9.504361027           1       < 0.1%
6.217222542           1       < 0.1%
5.599870342           1       < 0.1%
10.78649982           1       < 0.1%
7.424944591           1       < 0.1%
6.6616162             1       < 0.1%
6.21530731            1       < 0.1%
7.981036899           1       < 0.1%
6.344963412           1       < 0.1%
5.639501073           1       < 0.1%
5.527299246           1       < 0.1%
9.142233666           1       < 0.1%
5.260670005           1       < 0.1%
8.827413554           1       < 0.1%
8.115355082           1       < 0.1%
7.118465403           1       < 0.1%
7.611836667           1       < 0.1%
4.531581224           1       < 0.1%
8.562156382           1       < 0.1%
7.017578359           1       < 0.1%
8.460489787           1       < 0.1%
8.471508955           1       < 0.1%
5.702174923           1       < 0.1%
8.081496162           1       < 0.1%
Other values (3251)   3251    99.2%

Value                 Count   Frequency (%)
0.352                 1       < 0.1%
0.5303512947          1       < 0.1%
1.390870905           1       < 0.1%
1.683992581           1       < 0.1%
1.920271449           1       < 0.1%
2.102690991           1       < 0.1%
2.386653494           1       < 0.1%
2.39798499            1       < 0.1%
2.456013596           1       < 0.1%
2.458609195           1       < 0.1%

Value                 Count   Frequency (%)
13.127                1       < 0.1%
13.04380611           1       < 0.1%
12.91218664           1       < 0.1%
12.65336202           1       < 0.1%
12.62689974           1       < 0.1%
12.58002649           1       < 0.1%
12.36328483           1       < 0.1%
12.27937418           1       < 0.1%
12.2463941            1       < 0.1%
12.22717528           1       < 0.1%

Sulfate
Real number (ℝ≥0)

MISSING

Distinct: 2495
Distinct (%): 100.0%
Missing: 781
Missing (%): 23.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 333.7757766
Minimum: 129
Maximum: 481.0306423
Zeros: 0
Zeros (%): 0.0%
Memory size: 25.7 KiB

Quantile statistics

Minimum: 129
5-th percentile: 266.6162317
Q1: 307.6994978
median: 333.0735457
Q3: 359.9501704
95-th percentile: 403.0701898
Maximum: 481.0306423
Range: 352.0306423
Interquartile range (IQR): 52.25067255

Descriptive statistics

Standard deviation: 41.41684046
Coefficient of variation (CV): 0.1240858186
Kurtosis: 0.648262815
Mean: 333.7757766
Median Absolute Deviation (MAD): 26.0951759
Skewness: -0.03594662161
Sum: 832770.5626
Variance: 1715.354674
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
280.7456229           1       < 0.1%
332.7445192           1       < 0.1%
391.9182286           1       < 0.1%
330.9053704           1       < 0.1%
402.3134271           1       < 0.1%
360.6978151           1       < 0.1%
336.0404518           1       < 0.1%
405.5273372           1       < 0.1%
346.0636768           1       < 0.1%
368.5164413           1       < 0.1%
273.7192819           1       < 0.1%
380.7253358           1       < 0.1%
373.6527906           1       < 0.1%
274.4933955           1       < 0.1%
298.3799764           1       < 0.1%
296.0912732           1       < 0.1%
371.3618512           1       < 0.1%
366.2144279           1       < 0.1%
301.2308482           1       < 0.1%
365.8953377           1       < 0.1%
342.6040828           1       < 0.1%
297.239219            1       < 0.1%
274.6587881           1       < 0.1%
299.9588455           1       < 0.1%
325.3239549           1       < 0.1%
Other values (2470)   2470    75.4%
(Missing)             781     23.8%

Value                 Count   Frequency (%)
129                   1       < 0.1%
180.2067464           1       < 0.1%
182.3973702           1       < 0.1%
187.1707144           1       < 0.1%
187.4241309           1       < 0.1%
192.0335917           1       < 0.1%
203.4445208           1       < 0.1%
205.9350906           1       < 0.1%
206.2472294           1       < 0.1%
207.8904823           1       < 0.1%

Value                 Count   Frequency (%)
481.0306423           1       < 0.1%
476.5397173           1       < 0.1%
475.7374602           1       < 0.1%
462.474215            1       < 0.1%
460.107069            1       < 0.1%
458.4410723           1       < 0.1%
455.4512337           1       < 0.1%
450.9144544           1       < 0.1%
449.2676875           1       < 0.1%
447.4179624           1       < 0.1%

Conductivity
Real number (ℝ≥0)

UNIQUE

Distinct: 3276
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 426.2051107
Minimum: 181.483754
Maximum: 753.3426196
Zeros: 0
Zeros (%): 0.0%
Memory size: 25.7 KiB

Quantile statistics

Minimum: 181.483754
5-th percentile: 300.1094657
Q1: 365.7344141
median: 421.8849683
Q3: 481.7923045
95-th percentile: 566.3493198
Maximum: 753.3426196
Range: 571.8588656
Interquartile range (IQR): 116.0578904

Descriptive statistics

Standard deviation: 80.82406405
Coefficient of variation (CV): 0.1896365436
Kurtosis: -0.2770928329
Mean: 426.2051107
Median Absolute Deviation (MAD): 57.88759119
Skewness: 0.2644902239
Sum: 1396247.943
Variance: 6532.52933
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
564.3086542           1       < 0.1%
418.6420628           1       < 0.1%
517.5767619           1       < 0.1%
235.0422835           1       < 0.1%
501.5597252           1       < 0.1%
452.1872326           1       < 0.1%
367.8540248           1       < 0.1%
400.6118991           1       < 0.1%
469.1321169           1       < 0.1%
482.5957093           1       < 0.1%
528.2800236           1       < 0.1%
532.342083            1       < 0.1%
406.3720189           1       < 0.1%
424.0584184           1       < 0.1%
486.9837343           1       < 0.1%
518.9443622           1       < 0.1%
403.4209957           1       < 0.1%
320.3560139           1       < 0.1%
515.5750971           1       < 0.1%
419.1315765           1       < 0.1%
383.5270231           1       < 0.1%
449.7239517           1       < 0.1%
360.7556743           1       < 0.1%
389.0308888           1       < 0.1%
350.5773702           1       < 0.1%
Other values (3251)   3251    99.2%

Value                 Count   Frequency (%)
181.483754            1       < 0.1%
201.6197368           1       < 0.1%
210.319182            1       < 0.1%
217.3583296           1       < 0.1%
232.613624            1       < 0.1%
233.9079651           1       < 0.1%
235.0422835           1       < 0.1%
245.859632            1       < 0.1%
247.9180305           1       < 0.1%
251.0208987           1       < 0.1%

Value                 Count   Frequency (%)
753.3426196           1       < 0.1%
708.2263645           1       < 0.1%
695.369528            1       < 0.1%
674.4434759           1       < 0.1%
672.5569992           1       < 0.1%
669.7250862           1       < 0.1%
666.6906183           1       < 0.1%
660.2549463           1       < 0.1%
657.5704218           1       < 0.1%
656.9241278           1       < 0.1%

Organic_carbon
Real number (ℝ≥0)

UNIQUE

Distinct: 3276
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.28497025
Minimum: 2.2
Maximum: 28.3
Zeros: 0
Zeros (%): 0.0%
Memory size: 25.7 KiB

Quantile statistics

Minimum: 2.2
5-th percentile: 8.815361702
Q1: 12.06580133
median: 14.21833794
Q3: 16.55765154
95-th percentile: 19.63725445
Maximum: 28.3
Range: 26.1
Interquartile range (IQR): 4.49185021

Descriptive statistics

Standard deviation: 3.308161999
Coefficient of variation (CV): 0.2315834014
Kurtosis: 0.04440930715
Mean: 14.28497025
Median Absolute Deviation (MAD): 2.232294118
Skewness: 0.02553258209
Sum: 46797.56253
Variance: 10.94393581
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
10.37978308           1       < 0.1%
12.89763545           1       < 0.1%
15.87176979           1       < 0.1%
11.545477             1       < 0.1%
12.28433352           1       < 0.1%
18.58495937           1       < 0.1%
21.30064694           1       < 0.1%
15.28878163           1       < 0.1%
16.1692117            1       < 0.1%
12.16473568           1       < 0.1%
12.91150923           1       < 0.1%
10.3465741            1       < 0.1%
11.51966908           1       < 0.1%
12.31894242           1       < 0.1%
17.27165578           1       < 0.1%
8.567579587           1       < 0.1%
20.79437133           1       < 0.1%
12.3332371            1       < 0.1%
21.55886261           1       < 0.1%
18.4598902            1       < 0.1%
14.75925658           1       < 0.1%
10.39679564           1       < 0.1%
19.4021494            1       < 0.1%
15.501543             1       < 0.1%
15.17753397           1       < 0.1%
Other values (3251)   3251    99.2%

Value                 Count   Frequency (%)
2.2                   1       < 0.1%
4.371898608           1       < 0.1%
4.466771969           1       < 0.1%
4.473092264           1       < 0.1%
4.861631498           1       < 0.1%
4.902888068           1       < 0.1%
4.966861619           1       < 0.1%
5.051694615           1       < 0.1%
5.159380308           1       < 0.1%
5.188466455           1       < 0.1%

Value                 Count   Frequency (%)
28.3                  1       < 0.1%
27.00670661           1       < 0.1%
24.75539237           1       < 0.1%
23.95245044           1       < 0.1%
23.91760126           1       < 0.1%
23.66766678           1       < 0.1%
23.60429797           1       < 0.1%
23.56964491           1       < 0.1%
23.51477377           1       < 0.1%
23.39951606           1       < 0.1%

Trihalomethanes
Real number (ℝ≥0)

MISSING

Distinct: 3114
Distinct (%): 100.0%
Missing: 162
Missing (%): 4.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 66.39629295
Minimum: 0.738
Maximum: 124
Zeros: 0
Zeros (%): 0.0%
Memory size: 25.7 KiB

Quantile statistics

Minimum: 0.738
5-th percentile: 39.55292835
Q1: 55.84453562
median: 66.6224851
Q3: 77.33747291
95-th percentile: 92.12405947
Maximum: 124
Range: 123.262
Interquartile range (IQR): 21.49293729

Descriptive statistics

Standard deviation: 16.17500842
Coefficient of variation (CV): 0.2436131251
Kurtosis: 0.2385974402
Mean: 66.39629295
Median Absolute Deviation (MAD): 10.74217213
Skewness: -0.08303067408
Sum: 206758.0562
Variance: 261.6308975
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
86.99097046           1       < 0.1%
56.71550955           1       < 0.1%
77.73081437           1       < 0.1%
90.39489472           1       < 0.1%
37.78709664           1       < 0.1%
78.9255271            1       < 0.1%
89.47771837           1       < 0.1%
69.526718             1       < 0.1%
72.57395938           1       < 0.1%
57.78086932           1       < 0.1%
83.27758326           1       < 0.1%
70.13560786           1       < 0.1%
58.06246255           1       < 0.1%
70.34610229           1       < 0.1%
75.11488818           1       < 0.1%
70.54721764           1       < 0.1%
59.70020966           1       < 0.1%
41.27592659           1       < 0.1%
74.32689798           1       < 0.1%
40.29059137           1       < 0.1%
34.24621224           1       < 0.1%
59.70801705           1       < 0.1%
54.47393462           1       < 0.1%
68.91118479           1       < 0.1%
68.79917852           1       < 0.1%
Other values (3089)   3089    94.3%
(Missing)             162     4.9%

Value                 Count   Frequency (%)
0.738                 1       < 0.1%
8.175876384           1       < 0.1%
8.577012933           1       < 0.1%
14.34316145           1       < 0.1%
15.6848768            1       < 0.1%
16.2915046            1       < 0.1%
17.00068293           1       < 0.1%
17.52776496           1       < 0.1%
17.91572257           1       < 0.1%
18.01527236           1       < 0.1%

Value                 Count   Frequency (%)
124                   1       < 0.1%
120.030077            1       < 0.1%
118.3572747           1       < 0.1%
116.1616216           1       < 0.1%
114.2086714           1       < 0.1%
114.0349457           1       < 0.1%
113.0488857           1       < 0.1%
112.622733            1       < 0.1%
112.4122104           1       < 0.1%
112.0610274           1       < 0.1%

Turbidity
Real number (ℝ≥0)

UNIQUE

Distinct: 3276
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.96678617
Minimum: 1.45
Maximum: 6.739
Zeros: 0
Zeros (%): 0.0%
Memory size: 25.7 KiB

Quantile statistics

Minimum: 1.45
5-th percentile: 2.684279234
Q1: 3.43971087
median: 3.955027563
Q3: 4.500319787
95-th percentile: 5.220924525
Maximum: 6.739
Range: 5.289
Interquartile range (IQR): 1.060608918

Descriptive statistics

Standard deviation: 0.7803824085
Coefficient of variation (CV): 0.1967291341
Kurtosis: -0.06280064052
Mean: 3.96678617
Median Absolute Deviation (MAD): 0.5302962358
Skewness: -0.007816642377
Sum: 12995.19149
Variance: 0.6089967035
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
2.963135381           1       < 0.1%
3.987012091           1       < 0.1%
4.066229364           1       < 0.1%
3.759326201           1       < 0.1%
4.876273              1       < 0.1%
5.143750122           1       < 0.1%
4.513200539           1       < 0.1%
4.20418585            1       < 0.1%
4.586748359           1       < 0.1%
4.910911021           1       < 0.1%
4.894829651           1       < 0.1%
5.327584532           1       < 0.1%
2.817258117           1       < 0.1%
2.816074739           1       < 0.1%
5.123871935           1       < 0.1%
3.56360986            1       < 0.1%
3.395331743           1       < 0.1%
4.515322334           1       < 0.1%
3.91599081            1       < 0.1%
2.962432971           1       < 0.1%
4.584565663           1       < 0.1%
3.164187994           1       < 0.1%
4.822958046           1       < 0.1%
3.85115424            1       < 0.1%
3.735983476           1       < 0.1%
Other values (3251)   3251    99.2%

Value                 Count   Frequency (%)
1.45                  1       < 0.1%
1.492206615           1       < 0.1%
1.496100943           1       < 0.1%
1.64151501            1       < 0.1%
1.659799385           1       < 0.1%
1.680554025           1       < 0.1%
1.687624505           1       < 0.1%
1.801326999           1       < 0.1%
1.81252894            1       < 0.1%
1.844371604           1       < 0.1%

Value                 Count   Frequency (%)
6.739                 1       < 0.1%
6.494748556           1       < 0.1%
6.494249467           1       < 0.1%
6.389161009           1       < 0.1%
6.35743852            1       < 0.1%
6.307678472           1       < 0.1%
6.226580405           1       < 0.1%
6.204846359           1       < 0.1%
6.099631873           1       < 0.1%
6.083772354           1       < 0.1%

Potability
Boolean

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 25.7 KiB

Value                 Count   Frequency (%)
0                     1998    61.0%
1                     1278    39.0%

Interactions

(81 interaction plots, generated with Matplotlib v3.5.1; images not preserved.)

Correlations


Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line does not affect r; only the sign of its slope does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
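
As an illustration of that definition, here is a minimal pure-Python sketch. The function name `pearson_r` and the sample data are assumptions made for this example; they are not taken from pandas-profiling, which may compute the coefficient differently (for instance, by delegating to pandas).

```python
import math

def pearson_r(xs, ys):
    """Pearson's r: covariance of X and Y divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample covariance and sample standard deviations (the n - 1 factors cancel in the ratio).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return cov / (sd_x * sd_y)  # undefined (division by zero) if either variable is constant

# Illustrative data only, not taken from the dataset.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 1.0, 4.0, 3.0, 6.0]
print(pearson_r(xs, ys))
# Shifting and rescaling either variable (by a positive factor) leaves r unchanged, up to rounding.
print(pearson_r([10 * x + 7 for x in xs], [0.5 * y - 2 for y in ys]))
```

Rows with missing values would need to be dropped pairwise before calling such a function; the sketch does not handle NaN.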

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
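
A minimal sketch of this rank-based calculation follows. The helper names `average_ranks` and `spearman_rho` are hypothetical, and assigning tied values their average rank is an assumption of this example (it is the common convention, but the report does not say how ties are handled).

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman's rho: Pearson correlation applied to the rank variables."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Illustrative data: y = x**3 is monotonic but not linear.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
print(spearman_rho(xs, ys))
```

Because only the ordering of the values matters, any strictly increasing transformation of either variable leaves ρ unchanged.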

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
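
The pair-counting definition can be sketched directly. The function below, with the hypothetical name `kendall_tau_a`, implements exactly the formula as stated (often called tau-a); statistical libraries commonly default to a tie-corrected variant (tau-b), so results can differ when the data contain ties. The loop is O(n²) and meant only as an illustration.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    concordant = discordant = 0
    n_pairs = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        n_pairs += 1
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:        # both variables order the pair the same way
            concordant += 1
        elif sign < 0:      # the variables order the pair in opposite ways
            discordant += 1
        # sign == 0 means a tie in x or y: counted in the total, but neither concordant nor discordant
    return (concordant - discordant) / n_pairs

# Illustrative data only: two of the three pairs are concordant, one is discordant.
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```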

Phik (φk)

Phik (φk) is a correlation coefficient that works consistently across categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution.

Missing values

(Four missing-value plots, generated with Matplotlib v3.5.1; images not preserved.)

Sample

First rows

      ph         Hardness    Solids        Chloramines  Sulfate     Conductivity  Organic_carbon  Trihalomethanes  Turbidity  Potability
0     NaN        204.890455  20791.318981  7.300212     368.516441  564.308654    10.379783       86.990970        2.963135   0
1     3.716080   129.422921  18630.057858  6.635246     NaN         592.885359    15.180013       56.329076        4.500656   0
2     8.099124   224.236259  19909.541732  9.275884     NaN         418.606213    16.868637       66.420093        3.055934   0
3     8.316766   214.373394  22018.417441  8.059332     356.886136  363.266516    18.436524       100.341674       4.628771   0
4     9.092223   181.101509  17978.986339  6.546600     310.135738  398.410813    11.558279       31.997993        4.075075   0
5     5.584087   188.313324  28748.687739  7.544869     326.678363  280.467916    8.399735        54.917862        2.559708   0
6     10.223862  248.071735  28749.716544  7.513408     393.663396  283.651634    13.789695       84.603556        2.672989   0
7     8.635849   203.361523  13672.091764  4.563009     303.309771  474.607645    12.363817       62.798309        4.401425   0
8     NaN        118.988579  14285.583854  7.804174     268.646941  389.375566    12.706049       53.928846        3.595017   0
9     11.180284  227.231469  25484.508491  9.077200     404.041635  563.885481    17.927806       71.976601        4.370562   0

Last rows

      ph         Hardness    Solids        Chloramines  Sulfate     Conductivity  Organic_carbon  Trihalomethanes  Turbidity  Potability
3266  8.372910   169.087052  14622.745494  7.547984     NaN         464.525552    11.083027       38.435151        4.906358   1
3267  8.989900   215.047358  15921.412018  6.297312     312.931022  390.410231    9.899115        55.069304        4.613843   1
3268  6.702547   207.321086  17246.920347  7.708117     304.510230  329.266002    16.217303       28.878601        3.442983   1
3269  11.491011  94.812545   37188.826022  9.263166     258.930600  439.893618    16.172755       41.558501        4.369264   1
3270  6.069616   186.659040  26138.780191  7.747547     345.700257  415.886955    12.067620       60.419921        3.669712   1
3271  4.668102   193.681735  47580.991603  7.166639     359.948574  526.424171    13.894419       66.687695        4.435821   1
3272  7.808856   193.553212  17329.802160  8.061362     NaN         392.449580    19.903225       NaN              2.798243   1
3273  9.419510   175.762646  33155.578218  7.350233     NaN         432.044783    11.039070       69.845400        3.298875   1
3274  5.126763   230.603758  11983.869376  6.303357     NaN         402.883113    11.168946       77.488213        4.708658   1
3275  7.874671   195.102299  17404.177061  7.509306     NaN         327.459760    16.140368       78.698446        2.309149   1